Abstract

A recent line of work has focused on the use of low-density generator matrix (LDGM) codes
for lossy source coding. In this paper, we develop a generic technique for deriving lower bounds
on the rate-distortion functions of binary linear codes, with particular interest in the effect of
bounded degrees. The underlying ideas can be viewed as the source coding analog of the classical
result of Gallager, providing bounds for channel coding over the binary symmetric channel
using bounded degree LDPC codes. We illustrate this method for different random ensembles of
LDGM codes, including the check-regular ensemble and bit-check-regular ensembles, by deriving
explicit lower bounds on their rate-distortion performance as a function of the degrees.

Keywords: Lossy source coding; low-density generator matrix (LDGM) codes; rate-distortion; 
random ensembles; MAX-XORSAT. 

1 Introduction 

The problem of lossy source coding is to achieve maximal compression of a data source, subject
to some bound on the average distortion. Classical random coding arguments show that a
randomly chosen binary linear code will, with high probability, come arbitrarily close to the
rate-distortion bound for lossy compression of a symmetric Bernoulli source. However, such codes
are impractical, as it is neither possible to represent them in a compact manner, nor to perform
encoding/decoding in an efficient way. It is thus of considerable interest to explore and analyze
the use of structured codes for lossy compression. A particularly important subclass of structured
codes are those based on bounded-degree graphs, such as trellis codes, low-density parity check
(LDPC) codes, and low-density generator matrix (LDGM) codes. One practical approach to lossy
compression is via trellis-code quantization [13]. One limitation of trellis-based approaches is the
fact that saturating rate-distortion bounds requires increasing the trellis constraint length [22],
which incurs exponential complexity (even for the max-product or sum-product message-passing
algorithms). Other work shows that it is possible to approach the binary rate-distortion bound
using LDPC-like codes [T7] or nonlinear codes [11], albeit with degrees that grow at least
logarithmically with the blocklength. A parallel line of recent work [16\ [23l [H [21 [21] has explored the use
of low-density generator matrix (LDGM) codes for lossy compression. These codes correspond to
the duals of low-density parity check (LDPC) codes, and thus can be represented in terms of sparse
factor graphs. The results of this paper provide further insight into the effective rate-distortion
function of this class of sparse graph codes.

Focusing on binary erasure quantization (a special compression problem dual to binary erasure 
channel coding), Martinian and Yedidia [16] proved that LDGM codes combined with modified 
message-passing can saturate the associated rate-distortion bound. Various researchers have used 
techniques from statistical physics, including the cavity method and replica methods, to provide 
non-rigorous analyses of LDGM performance for lossy compression of binary sources [U [21 [2T]. In 
the limit of zero-distortion, this analysis has been made rigorous in a sequence of papers [5l [T9| [3l [7] . 
The papers [151 [Hj provide rigorous upper bounds on the effective rate-distortion function of various 
classes of LDGM codes, assuming the use of a maximum likelihood (ML) encoder. In terms of 
practical algorithms for lossy binary compression, several researchers have explored variants of the 
sum-product algorithm or survey propagation algorithms [H [HI [211 123] for quantizing binary sources. 

Previous rigorous analyses of the effective rate-distortion function of LDGM codes [151 [T3]
under ML encoding have been based on the first- and second-moment methods. Whereas the
second moment method provides a non-trivial upper bound on the effective rate-distortion function, the
first moment method (at least in its straightforward application) yields a well-known statement,
namely that the rate must be larger than the Shannon rate-distortion function. This lower bound,
though achieved for graphs with degrees that scale suitably with blocklength, is far from sharp for
these sparse graph codes. Accordingly, the primary contribution of this paper is the development of
a technique for generating sharper lower bounds on the effective rate-distortion function of sparse
graph codes. At a high level, the core of our approach can be understood as a source coding analog
of Gallager's [9] classical result on the effective capacity of bounded degree LDPC codes for channel
coding. Our main result (Theorem 1) shows explicitly how, for fixed sequences of codes, the gap
to the rate-distortion bound is controlled by a certain measure of the average overlap between
quantization balls. We illustrate our approach in application to some random ensembles of LDGM
codes, establishing how their effective rate-distortion performance compares to the Shannon limit
for various bit and check degrees. We note that since this work was initially presented [6], Kudekar
and Urbanke [12] have used related methods to establish lower bounds that hold for fixed codes,
as opposed to the random ensemble analysis of this paper.

The remainder of this paper is organized as follows. We begin in Section 2 with necessary
background material and definitions for source coding, factor graphs, and low-density generator
matrix codes, before stating and discussing our main results in Section 2.3. Section 3 is devoted
to a number of basic results, applicable to any binary linear code. These results show how lower
bounds on the effective rate-distortion function can be obtained by suitably lower bounding the growth
rate of the number of codewords with sufficiently low distortion. In general, this growth rate
(represented as a certain average of the overlaps between codewords) is a complicated quantity to
analyze, due to the non-uniform nature of the underlying random variable. Nonetheless, as we
show in Section 4, it is possible to obtain explicit and computable lower bounds on the
average-case performance of LDGM codes, using a graph-based certificate and ensemble averages to bound
the relevant overlap, and hence the rate-distortion function. This work
was first presented in part at the Information Theory Workshop, Lake Tahoe [6].

2 Background and Statement of Main Results 

We begin with background on binary linear codes, lossy source coding, factor graphs, and random
ensembles of low-density generator matrix codes. With these definitions, we then state our main
results in Section 2.3.

2.1 Binary codes and lossy source coding 

A binary linear code $C$ of block length $n$ consists of a linear subspace of $\{0,1\}^n$. One concrete
representation is as the range space of a given generator matrix $G \in \{0,1\}^{n \times m}$, as follows:

\[ C = \{ x \in \{0,1\}^n \mid x = Gz \text{ for some } z \in \{0,1\}^m \}. \tag{1} \]

The code $C$ consists of at most $2^m = 2^{nR}$ codewords, where $R = m/n$ is the code rate.
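
As a minimal sketch of definition (1), the following Python snippet enumerates the codewords generated by a small generator matrix over GF(2). The matrix `G_example` and the helper name `codewords` are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import itertools
import numpy as np

def codewords(G):
    """Return the set of distinct codewords {Gz mod 2 : z in {0,1}^m}."""
    n, m = G.shape
    words = set()
    for z in itertools.product((0, 1), repeat=m):
        x = G.dot(np.array(z)) % 2  # arithmetic over GF(2)
        words.add(tuple(int(b) for b in x))
    return words

# Hypothetical 4 x 2 generator matrix (n = 4, m = 2), for illustration only.
G_example = np.array([[1, 0],
                      [1, 0],
                      [0, 1],
                      [0, 1]])
code_example = codewords(G_example)
rate_example = G_example.shape[1] / G_example.shape[0]  # R = m / n
```

The enumeration costs $2^m$ matrix-vector products, so it is only meant for toy block lengths; a rank-deficient $G$ would yield fewer than $2^m$ distinct codewords, which is why the text says "at most".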

In the binary lossy source coding problem, the encoder observes a symmetric Bernoulli source
sequence $S \in \{0,1\}^n$, with each element $S_i$ drawn in an independent and identically distributed
(i.i.d.) manner from a Bernoulli distribution with parameter $p = \frac{1}{2}$. The idea is to compress the
source by representing each source sequence $S$ by some codeword $x \in C$. When using a code
in generator matrix form, one thinks of mapping each source sequence to some codeword $x \in C$
from a code containing $2^m = 2^{nR}$ elements, say indexed by the binary sequences $z \in \{0,1\}^m$.
The source decoding map $x \mapsto \hat{S}(x)$ associates a source reconstruction $\hat{S}(x)$ with each codeword
$x \in C$. The quality of the reconstruction can be measured in terms of the Hamming distortion
$d(\hat{S}, S) = \frac{1}{n} \sum_{i=1}^{n} |\hat{S}_i - S_i| = \frac{1}{n} \|\hat{S} \oplus S\|_1$. With this set-up, the source encoding problem is to find
the codeword with minimal distortion, namely the optimal encoding $x_{\mathrm{ML}} := \arg\min_{x \in C} d(\hat{S}(x), S)$.
Classical rate-distortion theory [4] dictates that, for the binary symmetric source, the optimal
trade-off between the compression rate $R$ and the best achievable average distortion $D = E[d(\hat{S}, S)]$ is
given by

\[ R(D) = 1 - H(D), \tag{2} \]

where $H(D) := -D \log D - (1 - D) \log(1 - D)$ is the binary entropy function.
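
For concreteness, the Shannon limit (2) can be evaluated numerically. The sketch below assumes logarithms to base 2 (so that rates are in bits per source symbol); the function names are illustrative only.

```python
import math

def binary_entropy(d):
    """H(d) = -d log2 d - (1 - d) log2 (1 - d), with H(0) = H(1) = 0."""
    if d <= 0.0 or d >= 1.0:
        return 0.0
    return -d * math.log2(d) - (1.0 - d) * math.log2(1.0 - d)

def shannon_rate(d):
    """Rate-distortion function R(D) = 1 - H(D) of the symmetric Bernoulli source."""
    return 1.0 - binary_entropy(d)
```

For example, `shannon_rate(0.11)` is close to 0.5, so a rate-1/2 code cannot achieve average distortion much below 0.11.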

2.2 Factor graphs, LDGM codes and random ensembles



Given a binary linear code $C$, specified by generator matrix $G$, the code structure can be captured
by a bipartite graph, in which square nodes (■) represent the checks attached to the code output
bits $x_i$ (or rows of $G$), and circular nodes (○) represent the information bits (or columns of $G$).
For instance, Fig. 1 shows the factor graph for a rate $R = 3/4$ code in generator matrix form, with
$n = 12$ checks (each associated with a unique source bit, top of diagram) connected to a total of
$m = 9$ information bits (bottom of diagram). The edges in this graph correspond to 1's in the generator
matrix, and reveal the subset of bits to which each information bit contributes. The degrees
of the check and variable nodes in the factor graph are $d_c = 3$ and $d_v = 4$ respectively,
so that the associated generator matrix $G$ has 3 ones in each row, and 4 ones in each column.
When the generator matrix is sparse, the resulting code is known as a low-density generator
matrix (LDGM) code. Some authors also refer to codes in which the degrees scale sublinearly with
blocklength as low-density; in this paper, we reserve this term for codes with degrees bounded
independently of the blocklength.

Figure 1. Factor graph of an LDGM code with $n = 12$ checks, $m = 9$ information bits,
and overall rate $R = m/n = 3/4$. The code illustrated is a bit-check-regular code from the ensemble
$\mathcal{E}(d_c, d_v)$, with bit degree $d_v = 4$ and check degree $d_c = 3$.

Our primary contribution is a technique for generating lower bounds on the rate-distortion
functions of binary linear codes. We illustrate this method concretely by application to two different
random ensembles of LDGM codes. The check-regular LDGM ensemble with degree $d_c$, denoted
by $\mathcal{E}(d_c)$, is formed by fixing a check degree $d_c$ and having each of the $n$ checks connect to $d_c$
of the $m$ information bits uniformly at random (with replacement). Doing so generates a set of
information bits with a random degree sequence, one which asymptotically obeys a Poisson law
with mean $d_c/R$. This particular ensemble of random graphs is the canonical choice in studying
random $k$-SAT, XORSAT and other satisfiability problems (e.g., [18], [5], [7]). Note that the problem
of LDGM encoding (finding the sequence of information bits that minimizes Hamming distortion) is
equivalent to an instance of a MAX-XORSAT problem.
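
The sampling procedure for the check-regular ensemble can be sketched as follows. This is only an illustration under stated assumptions: the edge-list representation, the function name, and the parameter values are hypothetical, and repeated picks by the same check are simply kept as parallel edges.

```python
import numpy as np

def sample_check_regular_edges(n, m, dc, rng):
    """Each of the n checks picks dc information bits uniformly, with replacement.

    Returns an (n, dc) integer array whose row i lists the bits chosen by check i.
    """
    return rng.integers(0, m, size=(n, dc))

rng = np.random.default_rng(0)
n, m, dc = 1200, 900, 3              # illustrative sizes, rate R = m / n = 3/4
edges = sample_check_regular_edges(n, m, dc, rng)

# Degree of each information bit = number of edge endpoints landing on it.
bit_degrees = np.bincount(edges.ravel(), minlength=m)
mean_degree = bit_degrees.mean()     # equals n * dc / m = dc / R exactly
```

The mean bit degree is $d_c/R$ by edge counting, while the individual degrees fluctuate around it, in line with the Poisson behavior described above.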

On the other hand, the bit-check-regular LDGM ensemble, denoted by $\mathcal{E}(d_c, d_v)$, is specified by
a pair of degrees $(d_c, d_v)$, one for the checks and one for the bits. The code ensemble consists of all
codes in which each check has degree exactly $d_c$ and each bit has degree exactly $d_v$. For example, the
code illustrated in Figure 1 is bit-check-regular with $(d_c, d_v) = (3, 4)$. This ensemble is the LDGM
analog of the Gallager regular ensembles [9] of LDPC codes.



2.3 Main results 

Given a code $C$, define for each binary string $u \in \{0,1\}^n$ the integer-valued quantity

\[ N(u, D; C) := \left| \{ z \in \{0,1\}^m \mid Gz \in \mathbb{B}_n(u; D) \} \right|, \tag{3} \]

that counts the number of information sequences that generate codewords within the Hamming
$D$-ball $\mathbb{B}_n(u; D) = \{ x \in \{0,1\}^n \mid \|u \oplus x\|_1 \le Dn \}$ centered at $u$. We let $E_U[1/N(U, D; C)]$ denote the average of $1/N$, where $U$ is
uniformly distributed over the ball $\mathbb{B}_n(0; D)$. With these definitions, we have

Theorem 1. Consider a fixed sequence of codes $C = C_n$ of rate $R$, indexed by blocklength $n$. If for
sufficiently large $n$, the rate-distortion pair $(R, D)$ satisfies the bound

\[ R < 1 - H(D) - \frac{1}{n} \log E_U[1/N(U, D; C)], \tag{4} \]

then the code family cannot achieve distortion $D$.

Since by definition we have $N(u, D; C) \ge 1$ for any $u \in \mathbb{B}_n(0; D)$, we always have the trivial
lower bound $-\frac{1}{n} \log E_U[1/N(U, D; C)] \ge 0$, under which the bound (4) reduces to the Shannon
rate-distortion bound. Indeed, this bound would be asymptotically tight for a random (high-density)
linear code. For other codes, obtaining more refined statements requires exploiting specific aspects
of the code structure.

Theorem 1 holds for any fixed (deterministic) sequence of codes. In order to use it to establish
lower bounds on the rate-distortion function of given code sequences, one needs upper bounds
on the quantity $\frac{1}{n} \log E_U[1/N(U, D; C)]$. The analysis of this quantity is facilitated by considering
random ensembles of codes. In particular, in Section 4 we analyze the behavior of the lower
bound (4) for different random ensembles of codes, thereby obtaining explicit lower bounds as
corollaries of Theorem 1. We begin with the check-regular ensemble:

Corollary 1. With probability converging to one with the blocklength, an LDGM code $C(d_c)$ randomly
drawn from the check-regular ensemble $\mathcal{E}(d_c)$ with check degree $d_c$ can only achieve those
rate-distortion pairs $(R, D)$ that satisfy the bound

\[ R \left[ 1 - \frac{1}{2} \exp\left( -\frac{(1 - D)\, d_c}{R} \right) \right] \ge 1 - H(D). \tag{5} \]

For any finite degree $d_c$, the minimal rate $R$ satisfying the relation (5) is strictly bounded away
from the Shannon rate-distortion bound $R(D) = 1 - H(D)$.

Figure 2. Excess rate, computed as a percentage of the Shannon limit $R(D) = 1 - H(D)$, versus
bit/check degrees for different random ensembles. (a) Plot of the lower bound from Corollary 1 for
the check-regular ensemble as a function of the check degree $d_c$. (b) Plot of the lower bound from
Corollary 2 for the bit-check-regular ensemble as a function of the bit degree $d_v$.



Figure 2(a) plots the minimal rates $R$ satisfying the bound (5), as a function of the check degree
$d_c$, for two different distortions. Although Corollary 1 bounds the rate-distortion function away
from the Shannon limit for any finite degree, note that the performance loss decreases rapidly as the
check degree $d_c$ is increased. However, we do know that the bound (5) is not sharp (in particular, it
is loose in the special case $D = 0$), and we suspect that our analysis could be refined in a number
of places.

We also have a complementary result for the case of bit-check-regular ensembles: 

Corollary 2 (Bit-check-regular ensembles). With probability converging to one with the blocklength,
an LDGM code $C(d_c, d_v)$ randomly drawn from the bit-check-regular ensemble $\mathcal{E}(d_c, d_v)$ with check
degree $d_c$ and bit degree $d_v$ can only achieve those rate-distortion pairs $(R, D)$ that satisfy the bound
1— m^ax mm{g{6,(3), {l — 5)(3}



where P : = ^ (^Y\ and g{6,p) : = \0g2ie) 
2d„^ R(l-D)



> l-^(^), (6) 



For any finite bit degree dv, the 



2 \dy 

minimal rate $R$ satisfying the relation (6) is strictly bounded away from the Shannon rate-distortion
bound $R(D) = 1 - H(D)$.

Figure 2(b) again illustrates¹ the excess rate guaranteed by Corollary 2 for different distortions,
this time as a function of the bit degree $d_v$ in this construction. At least on the basis of the
tightness of these bounds, the bit-check-regular ensemble appears to have rate-distortion performance
superior to the check-regular ensemble, which is to be expected intuitively.

¹ Strictly speaking, the plot in Figure 2(b) is misleading, in that not all rates shown can be achieved for every
degree. For instance, for degree $d_v = 3$, it is only possible to achieve rate 2/3 with a regular degree distribution.
However, we gloss over this technical issue for the sake of clearer comparison with Figure 2(a).

Note that in both Corollaries 1 and 2, the bounds are guaranteed to hold with high probability
over a choice of random code from the ensemble. Therefore, although these results guarantee that
almost all codes from the given ensembles are bounded away from the Shannon limit, they do not
rule out the possibility that there exists some fixed code of the specified type that achieves the
Shannon limit. In the initial conference version of this paper [6], we conjectured that no such codes
exist, i.e., that every code with degrees chosen according to the specified ensemble must satisfy the
given bounds. In recent work, Kudekar and Urbanke [12] used related methods to establish lower
bounds that hold for fixed bounded degree codes, as opposed to the random ensemble results given
here.

3 Tools for lower bounding rate-distortion 

In this section, we develop the analytical tools that underlie the proof of Theorem 1, as a prelude
to our analysis of random ensembles to follow in Section 4. We begin by describing a certain type
of $D$-ball encoder, in general sub-optimal relative to the optimal encoder, but more amenable to
analysis. As a first step, we establish the asymptotic optimality of this $D$-ball encoding. We then
exploit this encoder to prove Theorem 1.

3.1 Maximum likelihood and D-ball encoding 

Recall that any binary linear code $C \subseteq \{0,1\}^n$ can be characterized by a generator matrix
$G \in \{0,1\}^{n \times m}$ such that any codeword $x \in C$ is of the form $x = Gz$, where $z \in \{0,1\}^m$ is a sequence of
information bits. Given some source sequence $S \in \{0,1\}^n$ of symmetric Bernoulli random variables,
suppose that we quantize the source using the code $C$. We use $Z \in \{0,1\}^m$ to denote the random
information sequence, with associated codeword $\hat{S} = GZ$, to which the random sequence $S$ is
quantized.

The optimal encoding map, from source sequences $S$ to information sequences $Z$, is the so-called maximum
likelihood (ML) encoder. Given a source sequence, it computes the set of information sequences
that generate codewords closest to $S$ in Hamming distance, and outputs one of them uniformly at
random, say $Z_{\mathrm{ML}} \in \arg\min_{z \in \{0,1\}^m} \|Gz \oplus S\|_1$. The associated minimal distortion is a random
variable, defined as

\[ d_n(S; C) := \frac{1}{n} \min_{z \in \{0,1\}^m} \|Gz \oplus S\|_1 = \frac{1}{n} \|G Z_{\mathrm{ML}} \oplus S\|_1. \tag{7} \]

This ML encoder is optimal in that its expected distortion $E[d_n(S; C)]$ is minimal over all
encoders.

Despite the optimality of ML encoding, it is more convenient for theoretical purposes to analyze
the following $D$-ball encoder. For any fixed target distortion $D \in (0, \frac{1}{2})$, define the Hamming ball
of radius $Dn$ around the source sequence $S$ as follows:

\[ \mathbb{B}_n(S; D) := \{ x \in \{0,1\}^n \mid \|S \oplus x\|_1 \le Dn \}. \tag{8} \]

We say that the $D$-ball encoder succeeds if and only if the intersection $\mathbb{B}_n(S; D) \cap C$ is
non-empty, in which case it chooses some information sequence $Z_{\mathrm{DB}}$ uniformly at random from the set
$\{ z \in \{0,1\}^m \mid Gz \in \mathbb{B}_n(S; D) \}$. Otherwise, the encoder fails, and we set $Z_{\mathrm{DB}} = z^*$, where $z^*$ is
some arbitrary non-zero sequence. We claim that this $D$-ball encoder is asymptotically equivalent
to the ML encoder.
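
The two encoders can be compared on toy examples by exhaustive search. The sketch below is a brute-force illustration only (exponential in $m$); the generator matrix, the function names, and the choice to return `None` on failure (rather than a fixed sequence $z^*$) are assumptions made for the example, and ML ties are broken by enumeration order rather than uniformly at random.

```python
import itertools
import numpy as np

def all_info_sequences(m):
    """All 2^m binary information sequences, as numpy arrays."""
    return [np.array(z) for z in itertools.product((0, 1), repeat=m)]

def normalized_distortion(G, z, s):
    """(1/n) * Hamming distance between the codeword Gz (mod 2) and s."""
    return float(np.mean((G.dot(z) % 2) != s))

def ml_encode(G, s):
    """Brute-force ML encoder: returns (z, distortion) with minimal distortion."""
    best = min(all_info_sequences(G.shape[1]),
               key=lambda z: normalized_distortion(G, z, s))
    return best, normalized_distortion(G, best, s)

def d_ball_encode(G, s, D, rng):
    """D-ball encoder: uniform choice among z with Gz inside the D-ball of s.

    Returns None when the ball contains no codeword (encoder failure).
    """
    n = G.shape[0]
    candidates = [z for z in all_info_sequences(G.shape[1])
                  if np.sum((G.dot(z) % 2) != s) <= D * n]
    if not candidates:
        return None
    return candidates[rng.integers(len(candidates))]

# Hypothetical 4 x 2 generator matrix, for illustration only.
G_toy = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
rng_toy = np.random.default_rng(0)
```

With this matrix and $D = 1/4$, the source `[1, 0, 0, 0]` lies within distance 1 of two codewords, so the $D$-ball encoder succeeds, whereas `[1, 0, 1, 0]` is at distance 2 from every codeword and the encoder fails.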

Lemma 1. For any binary linear code, the following two conditions are equivalent:

(a) for all $\epsilon > 0$, the probability of success under $(D + \epsilon)$-ball encoding converges to one as
$n \to +\infty$.

(b) for all $\delta > 0$, we have $E[d_n(S; C)] \le D + \delta$ for all suitably large blocklengths $n$.

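Proof: We first show that (a) implies (b). Given any fixed $\delta > 0$, set $\epsilon = \delta/2$ in part (a),
and consider the associated $(D + \frac{\delta}{2})$-ball encoder. Setting $p_n = P[(D + \delta/2)\text{-ball success}]$ and
$\hat{S}_{\mathrm{DB}} = G Z_{\mathrm{DB}}$, we have

\[
\begin{aligned}
\frac{1}{n} E\left[ \|\hat{S}_{\mathrm{DB}} \oplus S\|_1 \right] &\le \left( D + \frac{\delta}{2} \right) p_n + (1 - p_n) \frac{1}{2} \\
&\le D + \frac{1}{2} \left[ 1 - p_n + p_n \delta \right] \\
&\le D + \delta,
\end{aligned}
\]

where the final inequality follows if we can ensure that $p_n \ge \frac{1 - 2\delta}{1 - \delta}$. Since $p_n \to 1$ by assumption,
this condition can be met by choosing $n$ sufficiently large. Finally, since ML encoding yields the
minimal average distortion, we have $E[d_n(S; C)] \le \frac{1}{n} E[\|\hat{S}_{\mathrm{DB}} \oplus S\|_1] \le D + \delta$, which is the claim
(b).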

We now prove that {not (a)} implies {not (b)}. Suppose that for some $\epsilon > 0$, the encoding
success probability $p_n = P[(D + \epsilon)\text{-ball success}]$ does not converge to 1. Then $\liminf p_n < 1$, so
that by taking subsequences if necessary, we may assume that for all sufficiently large $n$, the failure
probability satisfies $1 - p_n \ge \nu$ for some $\nu > 0$. Since the $(D + \epsilon)$-ball encoder can fail only if there
are no codewords within normalized distance $(D + \epsilon)$ of the source sequence, this statement implies
$P[d_n(S; C) > D + \epsilon] \ge \nu$.

Next, we claim that the ML distortion $d_n(S; C)$ is concentrated around its expected value.
Consider the martingale sequence based on exposing the source bits in the order $S_1, S_2, \ldots, S_n$,
and defining the sequence of random variables $Z_0 = E[d_n(S; C)]$, and

\[ Z_k := E[d_n(S; C) \mid S_1, \ldots, S_k], \quad \text{for } k = 1, 2, \ldots, n, \]

such that $Z_n = d_n(S; C)$. Note that changing one bit $S_i$ changes the (normalized) distortion
$d_n(S; C)$ by at most $1/n$, which means that $d_n(S; C)$ is a $c$-Lipschitz function with constant $c = 1/n$.
Therefore, applying the Azuma-Hoeffding inequality [10] to this martingale sequence yields
$P[|d_n(S; C) - E[d_n(S; C)]| \ge \epsilon] \le 2 \exp\left( -\frac{n \epsilon^2}{2} \right)$. Therefore, for any constant $\epsilon > 0$, we have

\[ \lim_{n \to +\infty} P\left[ |d_n(S; C) - E[d_n(S; C)]| \ge \epsilon \right] = 0. \tag{9} \]

Using the sharp concentration (9), we see that the bound $P[d_n(S; C) > D + \epsilon] \ge \nu$ implies that
$E[d_n(S; C)] \ge D + \epsilon/2$ for all sufficiently large $n$ along this subsequence; otherwise, the event
$\{d_n(S; C) > D + \epsilon\}$ would require a deviation of at least $\epsilon/2$ from the mean, which has vanishing probability.
Hence, we have established the existence of some $\epsilon > 0$ for which
there exists an infinite sequence of blocklengths $n$ along which $E[d_n(S; C)] \ge D + \epsilon/2$, thus implying
{not (b)}. ■



3.2 Proof of Theorem 1

We are now equipped to prove Theorem 1. If the code $C$ achieves average distortion $D$ by some
encoding method, then Lemma 1 implies that the $D$-ball encoder must achieve this distortion.
Letting $A(S, D; C)$ be the event that the $D$-ball encoder succeeds for source sequence $S$, denote

\[ p_n = P[A(S, D; C)]. \]

Recall the operation of the $D$-ball encoder: when it succeeds (that is, for any source sequence $S$
such that $N(S, D; C) \ge 1$) the encoder chooses an information sequence $Z$ uniformly at random
from all information sequences satisfying $GZ \in \mathbb{B}_n(S; D)$. Note that there are $N(S, D; C)$ such choices, by
the definition (3) of $N$. Otherwise, if $D$-ball encoding fails, the encoder simply chooses some fixed
non-zero information sequence $z^* \ne 0$.

By definition of this encoder, for any source sequence $s$ for which the $D$-ball encoding succeeds,
we have

\[ P[Z = z \mid S = s] = \begin{cases} \frac{1}{N(s, D; C)} & \text{if } Gz \in \mathbb{B}_n(s; D) \\ 0 & \text{otherwise.} \end{cases} \tag{10} \]

We now compute the PMF of the random variable $Z$.

Lemma 2. The PMF of $Z$ has the form

\[ P[Z = z] = \begin{cases} q(D; C) & \text{if } z \ne z^*, \text{ and} \\ q(D; C) + (1 - p_n) & \text{for } z = z^*, \end{cases} \tag{11} \]

where $q(D; C) := \sum_{u \in \mathbb{B}_n(0; D)} \frac{2^{-n}}{N(u, D; C)}$ and $p_n = P[A(S, D; C)]$.

Proof: For any $z \in \{0,1\}^m$ (not equal to the special sequence $z^*$), we have

\[ P[Z = z] = \sum_{s} P[Z = z \mid S = s] \, P[S = s] = \sum_{\{s \,\mid\, \|Gz \oplus s\|_1 \le Dn\}} \frac{2^{-n}}{N(s, D; C)}, \]

using the form of the PMF (10), and the fact that $P[S = s] = 2^{-n}$ for all source sequences.
Let $t \ne z^*$ be any other information sequence. Then

\[
\begin{aligned}
P[Z = z] &= \sum_{\{s \,\mid\, \|Gz \oplus s\|_1 \le Dn\}} \frac{2^{-n}}{N(s, D; C)} \\
&= \sum_{\{s \,\mid\, \|Gt \oplus (s \oplus G(z \oplus t))\|_1 \le Dn\}} \frac{2^{-n}}{N(s, D; C)} \\
&= \sum_{\{s' \,\mid\, \|Gt \oplus s'\|_1 \le Dn\}} \frac{2^{-n}}{N(s' \oplus G(z \oplus t), D; C)},
\end{aligned}
\]

where we have defined $s' := s \oplus G(z \oplus t)$.

We now claim that from the symmetry of the code, for any codeword $x_0 \in C$ and source sequence
$s \in \{0,1\}^n$, we have $N(s, D; C) = N(s \oplus x_0, D; C)$. Indeed, suppose that $N(s, D; C) = k$, with
codewords $x_1, \ldots, x_k$ such that $\|x_i \oplus s\|_1 \le Dn$ for $i = 1, \ldots, k$. Then the codewords $x'_i := x_i \oplus x_0$
satisfy

\[ \|x'_i \oplus (s \oplus x_0)\|_1 = \|(x_i \oplus x_0) \oplus (s \oplus x_0)\|_1 = \|x_i \oplus s\|_1 \le Dn, \]

so that $N(s \oplus x_0, D; C) \ge k$, and by symmetry $N(s \oplus x_0, D; C) = k$.

Consequently, we have $N(s' \oplus G(z \oplus t), D; C) = N(s', D; C)$, so that for $z \ne z^*$, we have

\[ P[Z = z] = \sum_{\{s' \,\mid\, \|Gt \oplus s'\|_1 \le Dn\}} \frac{2^{-n}}{N(s' \oplus G(z \oplus t), D; C)} = \sum_{\{s' \,\mid\, \|Gt \oplus s'\|_1 \le Dn\}} \frac{2^{-n}}{N(s', D; C)}. \]

This statement holds for any $t \ne z^*$, so that setting $t = 0$ yields the first part of the claim (11).

Finally, for $z = z^*$, we have

\[ P[Z = z^*] = q(D; C) + \sum_{\{s \,\mid\, A(s, D; C) \text{ not true}\}} 2^{-n} = q(D; C) + (1 - p_n), \]

which completes the proof.



We have expressed the probability of selecting a compressed sequence $z$ (and hence codeword)
as a function of the overlaps $q(D; C)$ between the $D$-balls of codewords. The key point now is that
if there are large overlaps (i.e., if $N(u, D; C)$ is large for many sequences $u$), then more codewords will be
needed to cover the total number $2^n$ of possible binary sequences, and hence there will be a rate
loss. This intuitive argument can be made formal by observing that the PMF of $Z$ needs to be
normalized, namely we must have $\sum_z P[Z = z] = 1$. Using the PMF of $Z$ from Lemma 2, we obtain
$2^{nR} q(D; C) + (1 - p_n) = 1$, or equivalently (upon solving for $p_n$):

\[ p_n = 2^{nR} \sum_{u \in \mathbb{B}_n(0; D)} \frac{2^{-n}}{N(u, D; C)}. \]

Taking logarithms yields

\[
\begin{aligned}
\frac{1}{n} \log p_n &= R - 1 + \frac{1}{n} \log \sum_{u \in \mathbb{B}_n(0; D)} \frac{1}{N(u, D; C)} \\
&= R - 1 + \frac{1}{n} \log \binom{n}{Dn} + \frac{1}{n} \log \left[ \frac{1}{\binom{n}{Dn}} \sum_{u \in \mathbb{B}_n(0; D)} \frac{1}{N(u, D; C)} \right],
\end{aligned}
\]

where the last equality holds by adding and subtracting $\log \binom{n}{Dn}$, corresponding to (the logarithm
of) the total number of binary sequences $|\mathbb{B}_n(0; D)|$ within the $D$-ball centered at zero.

By construction, the last term is just an expectation over a uniformly selected sequence $U$ in
$\mathbb{B}_n(0; D)$, as defined prior to the statement of Theorem 1, so that we have

\[ \frac{1}{n} \log p_n = R - 1 + \frac{1}{n} \log \binom{n}{Dn} + \frac{1}{n} \log E_U[1/N(U, D; C)]. \tag{12} \]

This expression is the exact exponent of the success probability of the $D$-ball encoder and might
be of independent interest. If this exponent is negative, the probability of success of the $D$-ball
encoder will vanish exponentially.
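
The normalization argument behind this exponent, namely that the success probability equals $2^{nR} \sum_{u \in \mathbb{B}_n(0;D)} 2^{-n}/N(u, D; C)$, can be checked by brute force on a toy code. The sketch below is only a numerical sanity check under stated assumptions: the generator matrix is hypothetical, the radius is passed directly as the integer $Dn$, and all function names are illustrative.

```python
import itertools
import numpy as np

def count_info_in_ball(G, center, radius):
    """N(center, D; C): number of z whose codeword Gz lies within `radius` of center."""
    m = G.shape[1]
    count = 0
    for z in itertools.product((0, 1), repeat=m):
        x = G.dot(np.array(z)) % 2
        if int(np.sum(x != center)) <= radius:
            count += 1
    return count

def success_probability(G, radius):
    """Direct computation: fraction of source sequences with a codeword in their ball."""
    n = G.shape[0]
    hits = sum(1 for s in itertools.product((0, 1), repeat=n)
               if count_info_in_ball(G, np.array(s), radius) >= 1)
    return hits / 2 ** n

def overlap_formula(G, radius):
    """2^{m - n} * sum over u in the ball around 0 of 1 / N(u, D; C)."""
    n, m = G.shape
    total = 0.0
    for u in itertools.product((0, 1), repeat=n):
        if sum(u) <= radius:
            total += 1.0 / count_info_in_ball(G, np.array(u), radius)
    return 2.0 ** (m - n) * total

# Hypothetical 4 x 2 generator matrix and radius Dn = 1, for illustration only.
G_check = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
p_direct = success_probability(G_check, 1)
p_formula = overlap_formula(G_check, 1)
```

For this toy code both quantities evaluate to 0.75: twelve of the sixteen source sequences lie within distance 1 of some codeword.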

Our final step is to apply standard asymptotics for binomial coefficients [4], namely
$\frac{1}{n} \log \binom{n}{Dn} = H(D) \pm o(1)$. Substituting into equation (12), we obtain that the probability of
$D$-ball success vanishes exponentially quickly if

\[ R < 1 - \frac{1}{n} \log |\mathbb{B}_n(0; D)| - \frac{1}{n} \log E_U[1/N(U, D; C)]. \tag{13} \]

By Lemma 1, since the $D$-ball encoder fails for this rate-distortion pair, the ML encoder must also
fail to achieve average distortion $D$, which establishes Theorem 1. The last term, which describes
the expected overlaps of $D$-balls, is the only term that depends on the specific code used, and
corresponds to the excess rate due to the code suboptimality.



4 Analysis over random ensembles 

From the lower bound (4) we see that any loss relative to the Shannon limit is captured by the
excess rate coefficient, namely $\log E_U[1/N(U, D; C)]/n$. For a fixed code $C$, a challenge associated
with analysis of this quantity is the possible non-uniformity in the cardinalities

\[ N(u, D; C) = \left| \{ z \in \{0,1\}^m \mid Gz \in \mathbb{B}_n(u; D) \} \right|. \]



As a concrete example, consider the rate $R = 1/2$ code with $(m, n) = (2, 4)$ and codewords



C 







110 



1 



1 1 



:]} 



With Dn = 1, a simple calculation shows that for this code. 



N{u,D;C) 



1 for u 
2 otherwise.



Consequently, the quantity 1/N{u, D;C) is directionally biased towards the quantization noise 



sequence u 

Although evaluating the excess rate coefficient for a fixed code appears difficult, if instead we
view $C$ as a random variable, drawn from some ensemble $\mathcal{E}$ of codes, then the excess rate becomes
a random variable, as a function of the random code $C$. We can then consider ensemble-based
analysis of this random variable.
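
The non-uniformity of $N(u, D; C)$ discussed above can be observed directly by brute force. The sketch below uses a hypothetical $(m, n) = (2, 4)$ code with codewords 0000, 1100, 0011 and 1111; this generator matrix is an assumption chosen for illustration and is not claimed to be the code of the example above. Logarithms are taken to base 2.

```python
import itertools
import math
import numpy as np

def ball_counts(G, radius):
    """Map each u with Hamming weight <= radius to N(u, D; C)."""
    n, m = G.shape
    counts = {}
    for u in itertools.product((0, 1), repeat=n):
        if sum(u) > radius:
            continue
        center = np.array(u)
        counts[u] = sum(
            1 for z in itertools.product((0, 1), repeat=m)
            if int(np.sum((G.dot(np.array(z)) % 2) != center)) <= radius
        )
    return counts

# Hypothetical generator matrix (n = 4, m = 2) and radius Dn = 1.
G_hyp = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
counts = ball_counts(G_hyp, 1)

# E_U[1/N(U, D; C)] with U uniform over the ball around zero.
avg_inverse = sum(1.0 / c for c in counts.values()) / len(counts)
# Code-dependent term -(1/n) log2 E_U[1/N] appearing in the bound of Theorem 1.
excess_term = -math.log2(avg_inverse) / G_hyp.shape[0]
```

For this toy code, $N = 1$ at the all-zeros sequence and $N = 2$ at each of the four weight-one sequences, so $E_U[1/N] = 0.6$ and the code-dependent term is roughly 0.18 bits per symbol.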



4.1 Concentration and graph-based certificate 

We begin by stating conditions involving expectations and concentration that are sufficient to yield
bounds (holding with high probability) on the rate-distortion function of a randomly drawn code $C$. In
this analysis, we consider random ensembles in which the code bits are exchangeable, meaning
that the probability distribution is invariant to permutations of the labelings of the code bits. For
instance, this exchangeability holds for the check-regular and bit-check-regular ensembles defined
in Section 2.2.



Proposition 1. Given an exchangeable ensemble $\mathcal{E}$, define the random variable

\[ W(D; C) := \left| \{ z \in \{0,1\}^m \mid (Gz)_i = 0 \text{ for all } i \notin \{1, 2, \ldots, Dn\} \} \right|, \tag{14} \]

and suppose that $\frac{1}{n} E[\log W(D; C)] \ge a(\mathcal{E}) > 0$, and moreover that for all $\delta \in (0, 1)$,

\[ P\left[ \log W(D; C) \le (1 - \delta)\, a(\mathcal{E})\, n \right] \le K\, 2^{-f(\delta) n}, \tag{15} \]

for some positive constant $K = K(\mathcal{E})$ independent of blocklength, and positive function $f : [0, 1] \to (0, \infty)$.
Then with probability converging to one as $n \to +\infty$, a randomly drawn code $C$ cannot
satisfy any rate-distortion pair $(R, D)$ for which

\[ R < 1 - H(D) + \max_{\delta \in (0,1)} \min\{ f(\delta), (1 - \delta)\, a(\mathcal{E}) \} - o(1). \tag{16} \]



Proof: For each $u \in \mathbb{B}_n(0; D)$, define the random variable

\[ M(u, D; C) := \left| \{ z \in \{0,1\}^m \mid Gz \in \mathbb{B}_n(0; D) \cap \mathbb{B}_n(u; D) \} \right|, \tag{17} \]

and note that $\frac{1}{N(u, D; C)} \le \frac{1}{M(u, D; C)}$ by construction. Using this fact, and applying Jensen's inequality
with the concavity of the logarithm, we have

\[ E_C\left[ \frac{\log E_U[1/N(U, D; C)]}{n} \right] \le E_C\left[ \frac{\log E_U[1/M(U, D; C)]}{n} \right] \le \frac{\log E_C E_U[1/M(U, D; C)]}{n}. \]

For an exchangeable ensemble of codes, the distribution of $M(u, D; C)$ depends only on the Hamming
weight $\|u\|_1$, so that we can write

\[ \frac{\log E_C E_U[1/M(U, D; C)]}{n} = \frac{1}{n} \log \left[ \frac{1}{|\mathbb{B}_n(0; D)|} \sum_{k=0}^{Dn} \binom{n}{k} E_C[1/M(u^k, D; C)] \right], \]

where for each $k = 0, 1, \ldots, Dn$, the vector $u^k$ has Hamming weight $\|u^k\|_1 = k$, with $u^0 = 0$ and,
for $k \ge 1$,

\[ u^k_i = \begin{cases} 1 & \text{for } i = 1, \ldots, k \\ 0 & \text{otherwise.} \end{cases} \]

Now observe that by the definition (14) of $W$ and $u^k$, we have $W(D; C) \le M(u^k, D; C)$ for all
$k = 0, 1, \ldots, Dn$, whence

\[ E_C\left[ \frac{\log E_U[1/N(U, D; C)]}{n} \right] \le \frac{\log E_C E_U[1/M(U, D; C)]}{n} \le \frac{1}{n} \log E_C[1/W(D; C)]. \]

Combining this bound with Theorem 1, we have shown that the ensemble rate-distortion must
satisfy

\[ R \ge 1 - H(D) - \frac{1}{n} \log E_C[1/W(D; C)] - o(1). \]

To conclude the proof, we now exploit the given assumptions on the behavior of $\log W(D; C)$.
Defining the event $A(\delta) := \{ 1/W(D; C) \ge 2^{-n(1-\delta) a(\mathcal{E})} \}$, we have $P[A(\delta)] \le K\, 2^{-n f(\delta)}$ from the
concentration (15). Since $1/W(D; C) \le 1$, we can write

\[
\begin{aligned}
\frac{1}{n} \log E_C[1/W(D; C)] &\le \frac{1}{n} \log \left[ P[A(\delta)] + 2^{-n(1-\delta) a(\mathcal{E})} \right] \\
&\le \frac{1}{n} \log \left[ K\, 2^{-n f(\delta)} + 2^{-n(1-\delta) a(\mathcal{E})} \right] \\
&\le -\min\{ f(\delta), (1 - \delta)\, a(\mathcal{E}) \} + o(1),
\end{aligned}
\]

since with $K$ independent of blocklength, we have $\log K / n = o(1)$.

Proposition 1 suggests a general procedure for proving lower bounds on the effective
rate-distortion function of different random ensembles of codes, by controlling the behavior of the random variable
$W(D; C)$. In order to do so, it is convenient to make use of the following graph-based certificate.
Given the factor graph describing the generator matrix $G$ of the code $C$, suppose that the last
$(1 - D)n$ checks are labeled as fixed, denoted by $S^{\mathrm{fix}}$. We use $N(S^{\mathrm{fix}}; C)$ to denote the subset of
information bits connected to at least one check in $S^{\mathrm{fix}}$, and let $T^{\mathrm{free}}(C)$ denote the complement of
these fixed information bits. See Figure 3 for an illustration of these concepts.

The key property of this construction is the following: suppose that we set $z_i = 0$ for all indices
$i \in N(S^{\mathrm{fix}}; C)$. With this setting, the information bits $z_j$ associated with indices $j \in T^{\mathrm{free}}(C)$ can be
altered arbitrarily, while still ensuring that the codeword $Gz$ satisfies $(Gz)_\ell = 0$ for all $\ell \in S^{\mathrm{fix}}$.
The number of free bits $F(C) := |T^{\mathrm{free}}(C)|$ thereby provides a lower bound on $\log W(D; C)$, one
which is relatively easy to analyze for random ensembles. This graph-based certificate suggests the
following two-stage approach for generating lower bounds:

(a) First establish a lower bound on $E[F(C)]$, and thus a lower bound on $E[\log W(D; C)]$.

(b) Next show that $F(C)$, and consequently $\log W(D; C)$, is larger than this lower bound with high
probability.
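
The graph-based certificate can be illustrated on a toy generator matrix. The sketch below is a brute-force check under stated assumptions: the matrix `G_cert`, the number of unconstrained checks, and the function names are hypothetical, and the fixed checks are taken to be the last rows of $G$ as in the construction above.

```python
import itertools
import numpy as np

def free_bits(G, num_unconstrained):
    """Information bits not adjacent to any fixed check.

    The fixed checks are the rows of G after the first `num_unconstrained` rows.
    """
    fixed_rows = G[num_unconstrained:, :]
    touched = fixed_rows.any(axis=0)
    return [j for j in range(G.shape[1]) if not touched[j]]

def count_W(G, num_unconstrained):
    """W(D; C): number of z with (Gz)_i = 0 on every fixed check (brute force)."""
    fixed_rows = G[num_unconstrained:, :]
    m = G.shape[1]
    return sum(1 for z in itertools.product((0, 1), repeat=m)
               if not np.any(fixed_rows.dot(np.array(z)) % 2))

# Hypothetical 6 x 4 generator matrix; the first Dn = 3 checks are unconstrained.
G_cert = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 1],
                   [1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [1, 1, 0, 0]])
free = free_bits(G_cert, 3)   # bits 2 and 3 touch no fixed check
W = count_W(G_cert, 3)        # number of z satisfying all fixed checks
```

Here $F(C) = 2$ while $W(D; C) = 8$, so $F(C) \le \log_2 W(D; C) = 3$: the certificate is a valid but generally loose lower bound, because setting the neighbors of the fixed checks to zero is sufficient, not necessary, to satisfy them.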
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3 fix 



n 



QQQOOOOOOOOO 




NiS^""; C) 



free 



Figure 3. Factor graph of an LDGM code illustrating the fixed checks S^^, information bit neighbors 
N{Ei^^; C) of fixed checks, and the free information bits T^''°''. 
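
The graph-based certificate can be checked by brute force on a toy instance. The following is a minimal sketch, assuming a code is stored as a list of check neighborhoods; the names `free_bits`, `codeword` and `certificate_holds`, and the toy parameters, are illustrative and not taken from the paper. It sets every bit adjacent to a fixed check to zero, enumerates all settings of the free bits, and confirms that the fixed checks stay at zero.

```python
import itertools
import random


def free_bits(check_neighbors, fixed_checks, m):
    """Information bits that touch no fixed check (the free set)."""
    touched = set()
    for ell in fixed_checks:
        touched.update(check_neighbors[ell])
    return [j for j in range(m) if j not in touched]


def codeword(check_neighbors, z):
    """Compute Gz over GF(2): each check is the XOR of its attached bits."""
    return [sum(z[j] for j in nbrs) % 2 for nbrs in check_neighbors]


def certificate_holds(check_neighbors, fixed_checks, m):
    """Enumerate all settings of the free bits with the other bits at zero."""
    free = free_bits(check_neighbors, fixed_checks, m)
    for values in itertools.product([0, 1], repeat=len(free)):
        z = [0] * m
        for j, v in zip(free, values):
            z[j] = v
        x = codeword(check_neighbors, z)
        if any(x[ell] for ell in fixed_checks):
            return False
    return True


# Toy check-regular instance (hypothetical sizes): each check picks d_c bits
# uniformly at random with repetition; the last num_fixed checks are fixed.
rng = random.Random(0)
n, m, d_c, num_fixed = 12, 8, 2, 3
check_neighbors = [[rng.randrange(m) for _ in range(d_c)] for _ in range(n)]
fixed_checks = list(range(n - num_fixed, n))
free = free_bits(check_neighbors, fixed_checks, m)
ok = certificate_holds(check_neighbors, fixed_checks, m)
```

Each of the $2^{F(C)}$ enumerated information sequences yields a codeword that vanishes on the fixed checks, which is the counting step behind the bound $\log W(D;C) \geq F(C)$.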



4.2 Lower bounds for specific ensembles 

We now illustrate this approach by using it to prove Corollary 1 for the check-regular ensemble
$\mathfrak{C}(d_c)$ and then to prove Corollary 2 for the bit-check-regular ensemble $\mathfrak{C}(d_c, d_v)$. (See Section 2.2
for definitions of these ensembles.) With appropriate modifications, the underlying ideas of our
approach should be more generally applicable, for instance to LDGM codes with irregular degree
distributions.



4.2.1 Proof of Corollary 1

Recall the check-regular ensemble of LDGM codes, denoted by $\mathfrak{C}(d_c)$: it consists of codes $C(d_c)$
with $n$ checks and $m$ information bits, constructed by having each check select $d_c$ bits uniformly
at random (and with repetition). We have the following result concerning the expected number of
free bits:

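Lemma 3. The expected number of free bits over the check-regular ensemble grows linearly:

$$\frac{\mathbb{E}[F(C)]}{m} \;=\; \left(1 - \frac{1}{m}\right)^{(1-D) n d_c} \;=:\; \beta, \tag{18}$$

and moreover, $F(C)$ is sharply concentrated, in that for all $\delta \in (0,1)$, we have

$$\mathbb{P}\big[F(C) \leq (1-\delta)\,\mathbb{E}[F(C)]\big] \;\leq\; 2 \exp\left\{ -m \left[(1-\beta) + \delta\beta\right] \log\left[1 + \frac{\delta\beta}{1-\beta}\right] \right\}. \tag{19}$$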



Proof: Each of the $d_c$ selections made by a particular check misses a particular information bit
with probability $1 - 1/m$, so that the bit is adjacent to that check with probability
$1 - (1 - 1/m)^{d_c}$, and this event occurs for each of the $(1-D)n$ fixed checks independently. The
probability that a particular bit is free (i.e. non-adjacent to $S^{\mathrm{fix}}$) is simply $\beta$ as defined in
equation (18), and the expected size of $T^{\mathrm{free}}$ is simply $\mathbb{E}[F(C)] = m\beta$ as claimed. Now we need
to show that the random variable $F(C)$ is concentrated around its mean. Since $F(C)$ is a sum of $m$
i.i.d. Bernoulli variables with mean $\beta$, Sanov's theorem [3] yields that for any $\delta \in (0,1)$,

$$\mathbb{P}\big[F(C) \leq (1-\delta)\beta m\big] \;\leq\; 2 \exp\left\{ -m\, \mathrm{KL}\big((1-\delta)\beta \,\|\, \beta\big) \right\},$$

where $\mathrm{KL}(a \,\|\, b)$ is the Kullback-Leibler divergence for Bernoulli variates. Noting the lower bound

$$\mathrm{KL}\big((1-\delta)\beta \,\|\, \beta\big) \;\geq\; \left[(1-\beta) + \delta\beta\right] \log\left[1 + \frac{\delta\beta}{1-\beta}\right],$$

claim (19) follows.
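
The expectation in equation (18) is easy to check by simulation. The following is a minimal sketch, assuming the sampling model described above (each fixed check draws $d_c$ bits uniformly with repetition); the function names, the seed and the parameter values are illustrative choices, not taken from the paper.

```python
import random


def sample_free_fraction(m, d_c, num_fixed, rng):
    """Draw the neighborhoods of the fixed checks; return F(C) / m."""
    touched = set()
    for _ in range(num_fixed):
        for _ in range(d_c):
            touched.add(rng.randrange(m))  # uniform choice, with repetition
    return (m - len(touched)) / m


def beta_check_regular(m, d_c, num_fixed):
    """Closed form (1 - 1/m)^{(1-D) n d_c}, with num_fixed = (1-D) n."""
    return (1.0 - 1.0 / m) ** (num_fixed * d_c)


rng = random.Random(1)
m, d_c = 100, 3      # m = R n information bits (n = 200, R = 0.5)
num_fixed = 40       # (1 - D) n fixed checks, i.e. D = 0.8
trials = 2000
empirical = sum(sample_free_fraction(m, d_c, num_fixed, rng)
                for _ in range(trials)) / trials
predicted = beta_check_regular(m, d_c, num_fixed)
```

The number of fixed checks is passed as an integer rather than computed as `(1 - D) * n`, which avoids floating-point rounding of the count.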



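Using Lemma 3 and Proposition 1, we can now prove Corollary 1. Assume that for some pair
$(R, D)$ and code $C$ drawn from the check-regular ensemble, the source encoder is successful. Recall
that $F(C)$ is a lower bound on $\log W(D;C)$. Using the fact that $m = Rn$, equation (18) implies
that

$$\frac{1}{n}\, \mathbb{E}[\log W(D;C)] \;\geq\; R \left(1 - \frac{1}{m}\right)^{(1-D) n d_c} \;=:\; \alpha(\mathfrak{C}) \;=\; R\beta.$$

Moreover, we have

$$\begin{aligned}
\mathbb{P}\big[\log W(D;C) \leq (1-\delta)\,\alpha(\mathfrak{C})\, n\big] &\leq \mathbb{P}\big[F(C) \leq (1-\delta)\,\beta m\big] \\
&= \mathbb{P}\big[F(C) \leq (1-\delta)\,\mathbb{E}[F(C)]\big] \\
&\leq 2 \exp\left\{ -m \left[(1-\beta) + \delta\beta\right] \log\left[1 + \frac{\delta\beta}{1-\beta}\right] \right\}
\end{aligned}$$
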

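using equation (19). Consequently, the hypotheses of Proposition 1 are satisfied with $K = 2$,
$\beta = \left(1 - \frac{1}{m}\right)^{(1-D) n d_c}$, $\alpha(\mathfrak{C}) = R\beta$, and

$$f(\delta; \beta) \;:=\; \log_2(e)\, R \left[(1-\beta) + \delta\beta\right] \log\left[1 + \frac{\delta\beta}{1-\beta}\right] \;=:\; R\, g(\delta; \beta),$$

so that Proposition 1 implies that

$$\begin{aligned}
R \;&\geq\; 1 - H(D) + R \max_{\delta \in (0,1)} \min\left\{ g(\delta; \beta),\, (1-\delta)\beta \right\} - o(1) \\
&\geq\; 1 - H(D) + \frac{1}{2} R \beta - o(1),
\end{aligned} \tag{20}$$

where it can be verified numerically that for $\delta = 0.5$, we have $g(\delta; \beta) \geq (1-\delta)\beta$ for all $\beta \in [0, 1]$.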
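
The numerical verification mentioned here can be sketched as follows, assuming $g(\delta;\beta) = \log_2(e)\,[(1-\beta) + \delta\beta] \log[1 + \delta\beta/(1-\beta)]$ with natural logarithm; the grid resolution is an arbitrary choice, and the endpoints $\beta = 0$ (where both sides vanish) and $\beta = 1$ (where $g$ diverges) are excluded.

```python
import math


def g(delta, beta):
    """g(delta; beta) for the check-regular ensemble, natural log inside."""
    return (math.log2(math.e)
            * ((1.0 - beta) + delta * beta)
            * math.log1p(delta * beta / (1.0 - beta)))


delta = 0.5
grid = [k / 1000 for k in range(1, 1000)]  # beta in (0, 1)
# Smallest value of g(delta; beta) - (1 - delta) * beta over the grid.
worst_gap = min(g(delta, b) - (1.0 - delta) * b for b in grid)
```

A positive `worst_gap` indicates that the minimum in (20) is attained by the term $(1-\delta)\beta$ at $\delta = 0.5$ on this grid, which gives the $\frac{1}{2} R \beta$ term.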
Finally, a standard Poisson limit yields

$$\lim_{n \to +\infty} \left(1 - \frac{1}{Rn}\right)^{(1-D) n d_c} \;=\; \exp\left( -\frac{(1-D)\, d_c}{R} \right),$$

so that $\beta \geq \exp\left( -\frac{(1-D) d_c}{R} \right) - o(1)$. Therefore, for all $n$ sufficiently large, the pair $(R, D)$ must
satisfy

$$R \left[ 1 - \frac{1}{2} \exp\left( -\frac{(1-D)\, d_c}{R} \right) \right] \;\geq\; 1 - H(D),$$

as claimed.


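The resulting condition $R\,[1 - \frac{1}{2}\exp(-(1-D)d_c/R)] \geq 1 - H(D)$ is implicit in $R$, but its left-hand side is increasing in $R$, so the smallest admissible rate can be found by bisection. The sketch below assumes this form of the bound; the function names and the example values of $D$ and $d_c$ are illustrative only.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def lhs(rate, D, d_c):
    """Left-hand side R * [1 - exp(-(1 - D) d_c / R) / 2] of the bound."""
    return rate * (1.0 - 0.5 * math.exp(-(1.0 - D) * d_c / rate))


def min_rate(D, d_c, iters=200):
    """Smallest R with lhs(R) >= 1 - H(D), by bisection on (0, 2]."""
    target = 1.0 - binary_entropy(D)
    lo, hi = 0.0, 2.0  # lhs(2) >= 1 >= target, so hi is always feasible
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if lhs(mid, D, d_c) >= target:
            hi = mid
        else:
            lo = mid
    return hi


D = 0.11
shannon = 1.0 - binary_entropy(D)
r3 = min_rate(D, d_c=3)
r6 = min_rate(D, d_c=6)
```

For a fixed distortion, the computed rate exceeds the Shannon limit $1 - H(D)$ and decreases towards it as the check degree $d_c$ grows.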

4.2.2 Proof of Corollary 2

We now turn to the proof of Corollary 2 concerning the effective rate-distortion function of the
bit-check-regular ensemble $\mathfrak{C}(d_c, d_v)$ of codes. We begin by addressing the expected value and
concentration of the random variable $F(C)$ for this ensemble.

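Lemma 4. The expectation of $F(C)$ grows linearly in blocklength: in particular, it is lower bounded
as

$$\frac{\mathbb{E}[F(C)]}{m} \;\geq\; \frac{1}{2} \left( \frac{D}{d_v} \right)^{d_v} \;=:\; \beta. \tag{21}$$

Moreover, it is sharply concentrated in that for all $\delta \in (0, 1)$,

$$\mathbb{P}\big[F(C) \leq (1-\delta)\,\mathbb{E}[F(C)]\big] \;\leq\; 2 \exp\left\{ -m \delta^2\, \frac{\beta^2}{2 d_v^2 R (1-D)} \right\}. \tag{22}$$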

Proof: Any code in the $\mathfrak{C}(d_c, d_v)$ ensemble is characterized by a set of $d_c n = d_v m$ edges, matching
the $n$ checks to the $m$ information bits. A random code is generated by selecting a permutation $\pi$
of the $d_c n$ edges, uniformly at random from all $(d_c n)!$ such permutations. For a fixed information
bit, let $N_{\mathrm{good}}$ denote the number of permutations in which all $d_v$ of its edges are adjacent to
non-fixed checks. Since there are $n(1-D)$ fixed checks and $Dn$ free checks,
there are $n(1-D)d_c$ edges adjacent to fixed checks. Consequently, the probability that a particular
information bit $i \in \{1, \ldots, m\}$ is free is simply $q = N_{\mathrm{good}}/(d_c n)!$.

To determine $q$ more explicitly, we count the number $N_{\mathrm{good}}$ of permutations that leave a partic-
ular information bit $i$ free. Since there are $D d_c n$ (labeled) edges adjacent to free checks, there are
$\binom{D d_c n}{d_v}\, d_v!$ ways for the given information bit to connect all of its $d_v$ outgoing edges to such checks.
The remaining $n d_c - d_v$ edges can be permuted arbitrarily without affecting the connectivity of the
given information bit, which produces another factor of $(n d_c - d_v)!$. Overall, we conclude that

$$q \;=\; \frac{\binom{D d_c n}{d_v}\, d_v!\, (n d_c - d_v)!}{(n d_c)!}. \tag{23}$$

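In order to show that the expectation $\mathbb{E}[F(C)]$ scales linearly, it suffices to show that $q$ is lower
bounded by a constant. We use the following bounds [20] that follow from Stirling's approximation:

$$\sqrt{2\pi m} \left( \frac{m}{e} \right)^m \;\leq\; m! \;\leq\; 2\sqrt{2\pi m} \left( \frac{m}{e} \right)^m. \tag{24}$$

We also require a lower bound on the binomial coefficient $\binom{n}{k}$; one such bound is

$$\binom{n}{k} \;\geq\; \left( \frac{n}{k} \right)^k \;=\; C(k)\, n^k, \tag{25}$$

where $C(k) = (1/k)^k$ for all positive integers $k$. Using these bounds, we obtain

$$(n d_c)! \;\leq\; 2\sqrt{2\pi n d_c} \left( \frac{n d_c}{e} \right)^{n d_c}, \quad \text{and} \quad (n d_c - d_v)! \;\geq\; \sqrt{2\pi (n d_c - d_v)} \left( \frac{n d_c - d_v}{e} \right)^{n d_c - d_v}.$$

Consequently, we have

$$q \;\geq\; \frac{(d_v)!\, C(d_v)\, (D d_c)^{d_v}\, n^{d_v} \sqrt{2\pi (n d_c - d_v)} \left( \frac{n d_c - d_v}{e} \right)^{n d_c - d_v}}{2\sqrt{2\pi n d_c} \left( \frac{n d_c}{e} \right)^{n d_c}}.$$

We have $\frac{\sqrt{2\pi (n d_c - d_v)}}{2\sqrt{2\pi n d_c}} = \frac{1}{2} - o(1)$ with $d_v$ and $d_c$ fixed, so that after some further
algebra, we obtain

$$\begin{aligned}
q \;&\geq\; \left\{ \frac{1}{2} - o(1) \right\} \left( C(d_v)\, (d_v)!\, D^{d_v} \right) \exp(d_v) \left( 1 - \frac{d_v}{n d_c} \right)^{n d_c - d_v} \\
&=\; \left\{ \frac{1}{2} - o(1) \right\} \left( C(d_v)\, (d_v)!\, D^{d_v} \right) \{ 1 - o(1) \},
\end{aligned}$$

where the final line follows since $\left( 1 - \frac{d_v}{n d_c} \right)^{n d_c - d_v}$ converges to $e^{-d_v}$ as $n$ tends to infinity. Overall,
since the quantity $C(d_v)\, (d_v)!\, D^{d_v} = \left( \frac{D}{d_v} \right)^{d_v} (d_v)!$ stays bounded from above for all $d_v \geq 2$, we
obtain the final lower bound

$$q \;\geq\; \frac{1}{2} \left( \frac{D}{d_v} \right)^{d_v} \tag{26}$$

as $n \to +\infty$. Since there are $m$ information bits in total, this bound with linearity of expectation
establishes the lower bound (21) on the expected value.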
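
The counting argument for $q$ can be checked with exact rational arithmetic. Expanding the binomial coefficient, the ratio $\binom{D d_c n}{d_v}\, d_v!\, (n d_c - d_v)! / (n d_c)!$ collapses to the product $\prod_{j=0}^{d_v - 1} (D d_c n - j)/(n d_c - j)$. The sketch below evaluates this product and compares it with the constant $\frac{1}{2}(D/d_v)^{d_v}$; the function name and the parameter values are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def free_probability(n, d_c, d_v, num_free_checks):
    """Exact probability that all d_v edges of one bit land on free checks."""
    total_edges = n * d_c
    free_edges = num_free_checks * d_c
    q = Fraction(1)
    for j in range(d_v):
        # j-th edge of the bit: (free_edges - j) good slots out of the rest.
        q *= Fraction(free_edges - j, total_edges - j)
    return q


n, d_c, d_v = 100, 2, 4   # rate R = d_c / d_v = 0.5
num_free_checks = 10      # D n free checks, i.e. D = 0.1
D = num_free_checks / n
q = free_probability(n, d_c, d_v, num_free_checks)
lower_bound = 0.5 * (D / d_v) ** d_v
```

In this example the exact value of $q$ lies between the lower bound and $D^{d_v}$, the value that $q$ approaches as $n$ grows with $D$, $d_c$ and $d_v$ held fixed.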

Finally, we establish the concentration (22) of $F(C)$ for this ensemble. By definition, the
random variable $F(C)$ is completely specified by the edge sets of fixed checks indexed by the set
$S^{\mathrm{fix}}$ of cardinality $(1-D)n$, as in Figure 3. For $i = 1, \ldots, (1-D)n$, let $V_i$ be a random variable
specifying the edge set of fixed check $i \in S^{\mathrm{fix}}$, and define the martingale sequence $Z_0 = \mathbb{E}[F(C)]$,
and $Z_i = \mathbb{E}[F(C) \mid V_1, \ldots, V_i]$, such that $Z_{(1-D)n} = F(C)$. Since each check has degree $d_c$, we have
the bound $|Z_{i+1} - Z_i| \leq d_c$. Therefore, by the Azuma-Hoeffding inequality pO], we have

$$\mathbb{P}\big[\, |F(C) - \mathbb{E}[F(C)]| \geq \epsilon (1-D) n \,\big] \;\leq\; 2 \exp\left\{ -n \epsilon^2 (1-D) / (2 d_c^2) \right\}.$$

Setting $\epsilon = \delta\, \frac{\beta m}{(1-D) n} = \delta\, \frac{\beta R}{1-D}$ yields that

$$\begin{aligned}
\mathbb{P}\big[F(C) \leq (1-\delta)\,\mathbb{E}[F(C)]\big] \;&\leq\; 2 \exp\left\{ -n \delta^2\, \frac{\beta^2 R^2}{(1-D)^2}\, \frac{(1-D)}{2 d_c^2} \right\} \\
&=\; 2 \exp\left\{ -m \delta^2\, \frac{\beta^2 R}{2 d_c^2 (1-D)} \right\} \\
&=\; 2 \exp\left\{ -m \delta^2\, \frac{\beta^2}{2 d_v^2 R (1-D)} \right\},
\end{aligned}$$

where the final step uses the relation $d_c = R d_v$.
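
The chain of equalities in the exponent is simple algebra, and a short numerical check guards against slips in the substitutions $m = Rn$ and $d_c = R d_v$. The following is a minimal sketch; the function name and the sample parameter values are arbitrary illustrative choices.

```python
import math


def exponent_forms(n, R, D, d_v, beta, delta):
    """Return the three equivalent forms of the Azuma-Hoeffding exponent."""
    m = R * n                              # number of information bits
    d_c = R * d_v                          # edge-count identity d_c n = d_v m
    eps = delta * beta * R / (1.0 - D)     # choice of epsilon in the proof
    first = n * eps ** 2 * (1.0 - D) / (2.0 * d_c ** 2)
    second = m * delta ** 2 * beta ** 2 * R / (2.0 * d_c ** 2 * (1.0 - D))
    third = m * delta ** 2 * beta ** 2 / (2.0 * d_v ** 2 * R * (1.0 - D))
    return first, second, third


beta = 0.5 * (0.1 / 4) ** 4                # (1/2) (D / d_v)^{d_v}
first, second, third = exponent_forms(
    n=1000, R=0.5, D=0.1, d_v=4, beta=beta, delta=0.5)
```

All three values should agree up to floating-point rounding.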



Using Lemma 4 and Proposition 1, we can now prove Corollary 2. Assume that for some pair
$(R, D)$ and code $C$ drawn from the bit-check-regular ensemble, the source encoder is successful. Recall
that $F(C)$ is a lower bound on $\log W(D;C)$. Using the fact that $m = Rn$, equation (21) implies
that

$$\frac{1}{n}\, \mathbb{E}[\log W(D;C)] \;\geq\; R\, \frac{1}{2} \left( \frac{D}{d_v} \right)^{d_v} \;=:\; \alpha(\mathfrak{C}) \;=\; R\beta.$$

Moreover, we have

$$\begin{aligned}
\mathbb{P}\big[\log W(D;C) \leq (1-\delta)\,\alpha(\mathfrak{C})\, n\big] &\leq \mathbb{P}\big[F(C) \leq (1-\delta)\,\beta m\big] \\
&= \mathbb{P}\big[F(C) \leq (1-\delta)\,\mathbb{E}[F(C)]\big] \\
&\leq 2 \exp\left\{ -m \delta^2\, \frac{\beta^2}{2 d_v^2 R (1-D)} \right\}
\end{aligned}$$

using equation (22). Consequently, the hypotheses of Proposition 1 are satisfied with $\beta = \frac{1}{2} \left( \frac{D}{d_v} \right)^{d_v}$,
$\alpha(\mathfrak{C}) = R\beta$, $K = 2$, and

$$f(\delta; \beta) \;:=\; R \log_2(e)\, \frac{\delta^2 \beta^2}{2 d_v^2 R (1-D)} \;=:\; R\, g(\delta; \beta).$$

Applying Proposition 1, we conclude that

$$R \;\geq\; 1 - H(D) + R \max_{\delta \in (0,1)} \min\left\{ g(\delta; \beta),\, (1-\delta)\beta \right\} - o(1), \tag{27}$$

thereby establishing the claim of Corollary 2.
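
The right-hand side of (27) can be evaluated numerically for given parameters, ignoring the $o(1)$ term. The sketch below assumes $\beta = \frac{1}{2}(D/d_v)^{d_v}$ and $g(\delta;\beta) = \log_2(e)\, \delta^2\beta^2 / (2 d_v^2 R (1-D))$, and maximizes over $\delta$ on a uniform grid; the function names, grid size and example parameters are illustrative and not part of the paper.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def beta_bit_check(D, d_v):
    """beta = (1/2) (D / d_v)^{d_v} for the bit-check-regular ensemble."""
    return 0.5 * (D / d_v) ** d_v


def g(delta, beta, R, D, d_v):
    """Concentration exponent g(delta; beta), in bits per information bit."""
    return (math.log2(math.e) * delta ** 2 * beta ** 2
            / (2.0 * d_v ** 2 * R * (1.0 - D)))


def excess(R, D, d_v, grid=1000):
    """Grid approximation of max over delta of min{g, (1 - delta) beta}."""
    b = beta_bit_check(D, d_v)
    return max(min(g(k / grid, b, R, D, d_v), (1.0 - k / grid) * b)
               for k in range(1, grid))


def rhs(R, D, d_v):
    """Right-hand side of the rate bound, without the o(1) term."""
    return 1.0 - binary_entropy(D) + R * excess(R, D, d_v)


R, D, d_v = 0.5, 0.11, 4
gap = excess(R, D, d_v)
bound = rhs(R, D, d_v)
```

Since $g$ scales as $\beta^2$ while the competing term scales as $\beta$, the excess over the Shannon limit is very small for these parameters, though strictly positive.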

5 Discussion 

We developed a technique for generating lower bounds on the effective rate-distortion function of
sparse graph codes. The basic underlying ideas are the source coding analogs of Gallager's [9] classical
work on the effective channel capacity of bounded degree codes. Our main result (Theorem 1)
provides a generic lower bound on the best possible distortion achievable by any family of rate
R codes. The essential object is the excess rate function, corresponding to a certain measure of
the average overlap between adjacent codewords. Using this theorem to obtain lower bounds for
specific code families requires methods for computing or lower bounding this excess rate term. In
this paper, we illustrated this approach by obtaining lower bounds for random ensembles of sparse
graph codes, including the check-regular and bit-check-regular ensembles of LDGM codes.
We note that the basic ideas are more generally applicable to other sparse code ensembles, such
as LDGM families with prescribed bit and check degree distributions. Moreover, recent work by
Kudekar and Urbanke [12] has shown how similar ideas can be used to obtain lower bounds on
fixed code sequences, as opposed to the random ensembles considered here.